翻訳と辞書 |
Data pre-processing : ウィキペディア英語版 | Data pre-processing Data pre-processing is an important step in the data mining process. The phrase "garbage in, garbage out" is particularly applicable to data mining and machine learning projects. Data-gathering methods are often loosely controlled, resulting in out-of-range values (e.g., Income: −100), impossible data combinations (e.g., Sex: Male, Pregnant: Yes), missing values, etc. Analyzing data that has not been carefully screened for such problems can produce misleading results. Thus, the representation and quality of data is first and foremost before running an analysis.〔Pyle, D., 1999. ''Data Preparation for Data Mining.'' Morgan Kaufmann Publishers, Los Altos, California.〕 If there is much irrelevant and redundant information present or noisy and unreliable data, then knowledge discovery during the training phase is more difficult. Data preparation and filtering steps can take considerable amount of processing time. Data pre-processing includes cleaning, normalization, transformation, feature extraction and selection, etc. The product of data pre-processing is the final training set. Kotsiantis et al. (2006) present a well-known algorithm for each step of data pre-processing.〔S. Kotsiantis, D. Kanellopoulos, P. Pintelas, "Data Preprocessing for Supervised Learning", ''International Journal of Computer Science'', 2006, Vol 1 N. 2, pp 111–117.〕 ==References==
抄文引用元・出典: フリー百科事典『 ウィキペディア(Wikipedia)』 ■ウィキペディアで「Data pre-processing」の詳細全文を読む
スポンサード リンク
翻訳と辞書 : 翻訳のためのインターネットリソース |
Copyright(C) kotoba.ne.jp 1997-2016. All Rights Reserved.
|
|